Introduction ¹


In 1982, Park and Schowengerdt [1] published an end-to-end analysis of a digital imaging system
that quantifies three principal degradation components: (i) image blur, the blurring caused by the
acquisition system; (ii) aliasing, caused by insufficient sampling; and (iii) reconstruction blur, the
blurring caused by imperfect interpolative reconstruction. This analysis, which measures degradation
as the square of the radiometric error, includes the sample-scene phase as an explicit random parameter
and characterizes the image degradation caused by imperfect acquisition and reconstruction together
with the effects of undersampling and random sample-scene phases. In a recent paper, Mitchell and
Netravali [3] displayed the visual effects of these degradations and presented a subjective analysis of
their relative importance in determining image quality.

The primary aim of the research in this paper is to use the analysis of Park and Schowengerdt 
[1],[8] to correlate their mathematical criteria for measuring image degradations with subjective visual 
criteria. Insight gained from this research can be exploited in the end-to-end design of optical systems, 
so that system parameters (transfer functions of the acquisition and display systems) can be designed 
relative to each other, to obtain the “best possible” results using quantitative measurements. 


Formulation 


In this section we present an end-to-end model of a digital imaging system. This model was used by 
Park and Schowengerdt [1] to derive expressions for the degradation caused by the various components 
of the system. 

The model upon which the results of this paper are based is described in Fig 1. The parameters
u and v are explicit sample-scene phase parameters which have the range ±1/2 for pixels placed at
unit distance from each other. The action of the imaging subsystem is described by the convolution
(denoted by *) of the system point spread function (PSF) h(x, y) with the scene

$$g(x - u, y - v) = h(x, y) * f(x - u, y - v) \tag{1}$$

The image is then sampled onto a Cartesian grid. This sampling operation is represented symbolically
as the multiplication of the image with the comb or Shah function

$$g_s(x, y; u, v) = g(x - u, y - v) \sum_{m} \sum_{n} \delta(x - m, y - n) \tag{2}$$

The notation $g_s(x, y; u, v)$ expresses the fact that the sampling subsystem is not shift-invariant.

¹ This paper refers to research described in references 1-8.


Figure 1: An Imaging, Sampling and Reconstruction System (stages: Scene, Image, Sampled Image, Reconstructed Image)


We take the point of view that the reconstruction filter r(x, y) is designed so that the reconstructed
image is an accurate reproduction of the output of the imaging system. The reconstructed image is
compared to the image g and not the scene f; thus, the reconstruction filter typically does not attempt
to perform any restoration.


Reconstruction is also symbolically modeled as a convolution operation

$$g_r(x, y; u, v) = r(x, y) * g_s(x, y; u, v) \tag{3}$$

Park and Schowengerdt measure the accuracy of reconstruction as the mean square radiometric error
and define the term

$$\epsilon_{SR}^2(u, v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left[ g(x - u, y - v) - g_r(x, y; u, v) \right]^2 \, dx \, dy \tag{4}$$

and analogously

$$\epsilon_I^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left[ f(x - u, y - v) - g(x - u, y - v) \right]^2 \, dx \, dy \tag{5}$$

where $\epsilon_{SR}^2$ and $\epsilon_I^2$ measure the sampling-reconstruction degradation and image blur
respectively. As suggested by the notation, image blur is independent of the sample-scene phase due to
the shift invariance of the convolution operation. The sampling-reconstruction degradation is not.
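The phase dependence described above can be checked numerically. The following is a minimal one-dimensional sketch, not the authors' code: the grid size, the sampling interval, the point-like scene and both Gaussian filters are hypothetical choices made only for illustration, and convolutions are circular because they are carried out with the FFT.

```python
import numpy as np

N, step = 256, 8                       # fine grid size and sampling interval (assumed values)
k = np.fft.fftfreq(N) * N              # integer frequency bins

scene = np.zeros(N)
scene[100] = 1.0                       # hypothetical scene: a single bright point
h_hat = np.exp(-(k / 40.0) ** 2)       # hypothetical imaging OTF (too wide for this sampling rate)
r_hat = np.exp(-(k / 12.0) ** 2)       # hypothetical reconstruction filter

def blur(signal, transfer):
    """Circular convolution carried out as a product in the frequency domain."""
    return np.fft.ifft(np.fft.fft(signal) * transfer).real

image_blur, sr_error = [], []
for u in range(step):                  # sample-scene phase, in fine-grid pixels
    f_shift = np.roll(scene, u)        # f(x - u)
    g_shift = blur(f_shift, h_hat)     # g(x - u), Eq. (1)
    g_s = np.zeros(N)
    g_s[::step] = g_shift[::step]      # comb sampling, Eq. (2)
    g_r = step * blur(g_s, r_hat)      # reconstruction, Eq. (3); gain restores the mean level
    image_blur.append(np.sum((f_shift - g_shift) ** 2))   # discrete analogue of Eq. (5)
    sr_error.append(np.sum((g_shift - g_r) ** 2))         # discrete analogue of Eq. (4)

print("image blur, min/max over phase:", min(image_blur), max(image_blur))
print("SR error,   min/max over phase:", min(sr_error), max(sr_error))
```

In this sketch the image blur is the same for every phase, while the sampling-reconstruction error changes depending on whether the bright point falls on a sample or between samples.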


Fourier analysis yields equivalent expressions for $\epsilon_{SR}^2$ and $\epsilon_I^2$ in the frequency
domain. Park et al. showed that

$$\epsilon_I^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left| 1 - \hat{h}(\nu_x, \nu_y) \right|^2 \left| \hat{f}(\nu_x, \nu_y) \right|^2 \, d\nu_x \, d\nu_y \tag{6}$$

where $(\nu_x, \nu_y)$ are spatial frequencies (units of cycles per sampling interval), $\hat{h}(\nu_x, \nu_y)$ is
the imaging subsystem OTF (optical transfer function) and $|\hat{f}(\nu_x, \nu_y)|$ is the magnitude of the
transform of the scene.
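The equivalence of the spatial-domain and frequency-domain forms of the image blur can be illustrated with a discrete analogue. This is a hedged sketch: the random scene and the Gaussian OTF are hypothetical stand-ins, and the factor `1 / scene.size` comes from NumPy's unnormalized FFT convention (discrete Parseval relation), not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
scene = rng.random((64, 64))                      # hypothetical stand-in scene

f = np.fft.fftfreq(64) * 64                       # integer frequency bins
fx, fy = np.meshgrid(f, f)
h_hat = np.exp(-(fx ** 2 + fy ** 2) / 12.0 ** 2)  # hypothetical low-pass OTF

f_hat = np.fft.fft2(scene)
image = np.fft.ifft2(h_hat * f_hat).real          # g = h * f, computed in the frequency domain

# Discrete analogue of Eq. (5): squared radiometric error in the spatial domain.
blur_spatial = np.sum((scene - image) ** 2)
# Discrete analogue of Eq. (6): the same quantity from the spectra.
blur_freq = np.sum(np.abs(1 - h_hat) ** 2 * np.abs(f_hat) ** 2) / scene.size

print(blur_spatial, blur_freq)
```

The two numbers agree to rounding error, which is the discrete counterpart of the statement that Eq. (6) is equivalent to Eq. (5).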


The corresponding expression for the sampling and reconstruction degradation is given in terms of
an ensemble of scenes formed by varying the sample-scene phase parameters uniformly over their entire
range. Thus, we obtain the expected value of this degradation in the form

$$E[\epsilon_{SR}^2] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^2(\nu_x, \nu_y) \left| \hat{h}(\nu_x, \nu_y) \right|^2 \left| \hat{f}(\nu_x, \nu_y) \right|^2 \, d\nu_x \, d\nu_y \tag{7}$$

where the term $e^2(\nu_x, \nu_y)$ accounts for the effects of imperfect reconstruction and undersampling and
is given by

$$e^2(\nu_x, \nu_y) = \left| 1 - \hat{r}(\nu_x, \nu_y) \right|^2 + \sum_{(m,n) \neq (0,0)} \left| \hat{r}(\nu_x - m, \nu_y - n) \right|^2 \tag{8}$$

where $\hat{r}(\nu_x, \nu_y)$ is the reconstruction filter, i.e. the Fourier transform of r(x, y).
$E[\epsilon_{SR}^2]$ can be written as the sum of two terms,

$$E[\epsilon_{SR}^2] = \epsilon_R^2 + \epsilon_S^2 \tag{9}$$

where

$$\epsilon_R^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left| 1 - \hat{r}(\nu_x, \nu_y) \right|^2 \left| \hat{h}(\nu_x, \nu_y) \hat{f}(\nu_x, \nu_y) \right|^2 \, d\nu_x \, d\nu_y \tag{10}$$

and

$$\epsilon_S^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \left[ \sum_{(m,n) \neq (0,0)} \left| \hat{r}(\nu_x - m, \nu_y - n) \right|^2 \right] \left| \hat{h}(\nu_x, \nu_y) \hat{f}(\nu_x, \nu_y) \right|^2 \, d\nu_x \, d\nu_y \tag{11}$$

The term $\epsilon_R^2$ accounts for imperfect reconstruction while $\epsilon_S^2$ accounts for aliasing due to
undersampling.
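The split of the expected sampling-reconstruction degradation into a reconstruction term and an aliasing term can be sketched numerically. The example below is a one-dimensional reduction chosen for brevity, not the authors' implementation; the function name `sr_blur_terms`, the truncation of the sideband sum to `n_replicas` terms, the flat scene spectrum and the Gaussian filters are all assumptions made for illustration.

```python
import numpy as np

def sr_blur_terms(h_hat, r_hat, scene_power, nu, n_replicas=8):
    """1-D analogue of the reconstruction, aliasing and total SR terms.

    h_hat, r_hat : callables returning transfer-function values at nu (cycles/sample)
    scene_power  : callable returning the scene power spectrum |f_hat(nu)|^2
    nu           : uniformly spaced frequency grid
    """
    dnu = nu[1] - nu[0]
    image_power = np.abs(h_hat(nu)) ** 2 * scene_power(nu)
    recon = np.abs(1.0 - r_hat(nu)) ** 2                       # base-band term
    sidebands = sum(np.abs(r_hat(nu - m)) ** 2                 # sideband term, truncated sum
                    for m in range(-n_replicas, n_replicas + 1) if m != 0)
    eps_r = np.sum(recon * image_power) * dnu                  # imperfect reconstruction
    eps_s = np.sum(sidebands * image_power) * dnu              # aliasing
    eps_sr = np.sum((recon + sidebands) * image_power) * dnu   # expected SR degradation
    return eps_r, eps_s, eps_sr

nu = np.linspace(-3.0, 3.0, 6001)
gauss = lambda tau: (lambda v: np.exp(-v ** 2 / tau ** 2))     # hypothetical Gaussian filters
flat = lambda v: np.ones_like(v)                               # hypothetical flat scene spectrum

eps_r, eps_s, eps_sr = sr_blur_terms(gauss(0.5), gauss(0.4), flat, nu)
print(eps_r, eps_s, eps_sr)
```

With an ideal low-pass reconstruction filter and an image that is bandlimited below the Nyquist frequency, both terms vanish in this sketch, which matches the classical sampling theorem.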


Analysis and Visual Perception of Image and SR Blur 


Image blur is caused by the non-ideal frequency response of the imaging subsystem. Eq. (6) is a 
mathematical statement of this fact. Almost invariably, the frequency response of an imaging system 
approaches zero at high frequencies and thus this subsystem acts as a low-pass filter. Image blur alone, 
uncoupled from sampling and reconstruction blur, is perceived as a loss of high frequency detail in the 
scene. 

The average sampling and reconstruction blur, as suggested by Eq. (7), is caused by inadequacies in
both the sampling and the reconstruction subsystems. The sampling contribution to this degradation is
expressed by Eq. (11), which states that aliasing is caused by the presence of significant image energy
at frequencies where the energy in the reconstruction filter sidebands

$$\sum_{(m,n) \neq (0,0)} \left| \hat{r}(\nu_x - m, \nu_y - n) \right|^2 \tag{12}$$

is not zero. This is illustrated in Fig 2, where the replicas of the image spectrum (formed by sampling)
overlap and the reconstruction filter cannot isolate a pure version of the base-band spectrum. This
type of degradation is sometimes called prealiasing [3] and will always be present if the image is not
sufficiently sampled, even with perfect reconstruction.

Even when the replicated spectra do not overlap (i.e., the image has been sufficiently sampled), image
quality may suffer due to poor reconstruction, as illustrated in Fig 3. In this case, the response of the
reconstruction filter is too broad and thus the reconstructed signal includes some (high) frequencies
not present in the original image. This type of aliasing is sometimes called postaliasing [3]. When the
image spectrum has significant power at frequencies very near the Nyquist (cutoff) frequency (i.e., the
image spectrum and its nearest replica come very close to each other), the design of the reconstruction
filter becomes difficult as the roll-off has to be very sharp (resulting in a filter with a very large kernel
in the spatial domain). This problem has been noted by several researchers [2], [3].

Pre- and postaliasing are often perceived as artifacts in the reconstructed scene [3]. However,
it should be noted that, in general, an absence of artifacts does not imply that there is no pre- or
postaliasing. Aliasing can manifest itself as blurring as well (due to attenuation of the high frequencies
in the scene or image spectrum) and is then almost impossible to differentiate from image blur.

Figure 4: Baseband attenuation resulting from imperfect reconstruction

In addition to removing the sidebands of the signal spectrum, the reconstruction filter also needs to
pass the original image spectrum base-band with minimal distortion (Fig 4). Eq. (10) states this idea
formally. It measures the contribution to SR blur caused by the presence of significant image energy,
$|\hat{h}(\nu_x, \nu_y) \hat{f}(\nu_x, \nu_y)|^2$, at frequencies where $\hat{r}(\nu_x, \nu_y) \neq 1$. This type of
reconstruction error is known as base-band attenuation. It is analogous to image blur in the sense that
the reconstruction filter acts as a low-pass filter, resulting in a loss of high-frequency detail in the
reconstructed image.

The problem of designing a good reconstruction filter is made difficult because an “ideal” filter is
a sinc filter in the spatial domain. The sinc is quasi-ideal in the sense that a signal can be perfectly
reconstructed from its samples by using sinc-interpolation only if the signal is bandlimited and
sufficiently sampled. However, the sinc is impossible to realize in practice and finite approximations to it
produce an effect commonly known as ringing. Ringing is perceived as rippling patterns radiating from
high contrast edges [3] and is strongly suggested by the form of the impulse response of the sinc.

Another problem in designing a reconstruction filter is the problem of sample-frequency ripple. This
problem can be best understood in terms of a uniformly gray image which is sampled and reconstructed
to yield an image where the gray-level uniformity is destroyed. This is often perceived as spurious
patterns on the background in an image. To eliminate this problem, it is necessary to design the
reconstruction filter $\hat{r}(\nu_x, \nu_y)$ so that the equation

$$\sum_{m} \sum_{n} r(x - m, y - n) = 1 \tag{13}$$

is satisfied.
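The ripple condition can be tested for a candidate kernel by summing its unit-spaced shifted copies and checking how far the sum departs from a constant. The following is a one-dimensional sketch under stated assumptions: the helper names, the truncation of the sum to `support` shifts on each side, and the two example kernels (linear interpolation and a unit-area Gaussian) are illustrative choices, not taken from the paper.

```python
import numpy as np

def sample_frequency_ripple(kernel, support=20, n_points=201):
    """Peak-to-peak variation of sum_m kernel(x - m) over one sampling interval."""
    x = np.linspace(0.0, 1.0, n_points)
    m = np.arange(-support, support + 1)
    total = kernel(x[:, None] - m[None, :]).sum(axis=1)   # reconstruction of a constant image
    return total.max() - total.min(), total

def triangle(x):
    """Linear-interpolation kernel; its shifted copies sum to one everywhere."""
    return np.clip(1.0 - np.abs(x), 0.0, None)

def gaussian_kernel(sigma):
    """Unit-area spatial Gaussian with standard deviation sigma (in sampling intervals)."""
    return lambda x: np.exp(-x ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

for name, kern in [("triangle", triangle),
                   ("gaussian sigma=0.3", gaussian_kernel(0.3)),
                   ("gaussian sigma=1.0", gaussian_kernel(1.0))]:
    ripple, _ = sample_frequency_ripple(kern)
    print(name, ripple)
```

A narrow spatial Gaussian (a wide display transfer function) leaves a clearly visible ripple in this sketch, while a broad one suppresses it almost completely at the cost of more base-band attenuation.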

An important point to note in this discussion is that even though it is possible (at least in theory)
to minimize image blur and sampling-reconstruction blur individually by suitable filter design, in an
end-to-end system the subsystems cannot be designed in isolation from one another to minimize both
image and sampling-reconstruction blur simultaneously. Eq. (6) suggests that image blur is minimized
when $\hat{h}(\nu_x, \nu_y) = 1$ for all frequencies where there is non-zero scene energy. However, from
Eqs. (7) and (8) we see that the average sampling-reconstruction blur will be minimized when
$\hat{h}(\nu_x, \nu_y) = 0$ at all those frequencies where the reconstruction filter does not have unit
(perfect) response. These are conflicting requirements and a compromise has to be achieved based on the
relative visual importance of the two types of degradation.
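The conflict between the two requirements can be made concrete with a small numerical sweep. This is a one-dimensional sketch under assumptions: the Gaussian form of both filters, the scene power spectrum, the fixed display parameter `tau_dtf` and the truncated sideband sum are all hypothetical choices for illustration.

```python
import numpy as np

nu = np.linspace(-4.0, 4.0, 4001)              # cycles per sampling interval
dnu = nu[1] - nu[0]
scene_power = 1.0 / (1.0 + (nu / 0.2) ** 2)    # hypothetical scene power spectrum

def gauss(v, tau):
    """Hypothetical Gaussian transfer function with spread parameter tau."""
    return np.exp(-v ** 2 / tau ** 2)

tau_dtf = 0.4                                  # fixed display filter (assumed value)
e2 = (1 - gauss(nu, tau_dtf)) ** 2 + sum(
    gauss(nu - m, tau_dtf) ** 2 for m in range(-8, 9) if m != 0)

taus = np.linspace(0.1, 1.0, 10)               # camera filter spreads to sweep
image_blur = np.array([np.sum((1 - gauss(nu, t)) ** 2 * scene_power) * dnu for t in taus])
sr_blur = np.array([np.sum(e2 * gauss(nu, t) ** 2 * scene_power) * dnu for t in taus])

for t, ib, sb in zip(taus, image_blur, sr_blur):
    print(f"tau_ctf={t:.1f}  image blur={ib:.4f}  SR blur={sb:.4f}  total={ib + sb:.4f}")
```

Widening the camera response lowers the image blur and raises the expected sampling-reconstruction blur at every step of the sweep, so any choice of the camera parameter trades one degradation for the other.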

It has been observed [6], [7] that the response of human viewers to various spatial effects of filters is
subjective. Filters that result in some aliasing and base-band attenuation have sometimes been observed
to yield results which are visually pleasing to human viewers. There is evidence [6] that suggests that a
moderate amount of ringing can “improve” the visual quality of an image by introducing an illusion of
sharpness (high frequency), although in terms of the amount of degradation (which can be measured
by Eqs. (6)-(11)), this may correspond to a higher mean squared error.

In our research, we attempt to correlate the mathematical criteria for optimal end-to-end processing 
with subjective visual testing for Gaussian transfer functions. There is evidence that people prefer 
some aliasing and ringing (which give an illusion of sharpness), but that people are sensitive to high- 
frequency suppression (blurring). The primary motivation of this study is to assess the effect of each of 
these individual degradation components on the quality of the reconstructed image. This research has 
application to the design of end-to-end imaging systems where the components can be tuned to obtain 
the best possible results. In the next section we describe our models for the various components of the 
system and simulation results. 


Imaging and Reconstruction System Transfer Function Models 


In order to simulate an end-to-end imaging system it is necessary to associate a model with the imaging
(camera) subsystem and the reconstruction (display) subsystem; i.e., we need to assign a functional form
to both $\hat{h}(\nu_x, \nu_y)$ and $\hat{r}(\nu_x, \nu_y)$. In the discussion that follows, we refer to $\hat{h}$ as
the Camera Transfer Function (CTF) and $\hat{r}$ as the Display Transfer Function (DTF). In our analysis
we have chosen to model the CTF and DTF as Gaussian functions of the form

$$e^{-\left[ \frac{1}{\tau^2} \left( \nu_x^2 + \nu_y^2 \right) \right]} \tag{14}$$

where $\tau$ is the parameter which controls the spread of the function. It can be shown that $\tau$ is
proportional to the standard deviation of the Gaussian function ($\sigma_\nu$). They are related as

$$\tau = \sqrt{2} \, \sigma_\nu \tag{15}$$

The two-dimensional Gaussian function is separable and its Fourier transform is also a Gaussian.

In our model, the Gaussian is symmetric in the two dimensions, resulting in the filter kernels being
circularly symmetric. Thus, only a single parameter ($\tau$) is required to characterize each of the
Gaussian functions representing the CTF and the DTF. Moreover, due to the duality of the Gaussian and
its Fourier transform, a broad frequency response can be achieved by a very small kernel in the spatial
domain and vice versa.
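The Gaussian model and the duality between frequency response and kernel size can be written down in a few lines. This is a sketch that assumes the transfer function has the form exp(-(nu_x^2 + nu_y^2) / tau^2), the form consistent with tau = sqrt(2) sigma; the function names are hypothetical, and the spatial width uses the standard Fourier pair of a Gaussian (a frequency-domain standard deviation s corresponds to a spatial standard deviation 1 / (2 pi s)).

```python
import numpy as np

def gaussian_transfer_function(nu_x, nu_y, tau):
    """Circularly symmetric Gaussian transfer function with spread parameter tau (assumed form)."""
    return np.exp(-(nu_x ** 2 + nu_y ** 2) / tau ** 2)

def tau_to_sigma(tau):
    """Standard deviation of the frequency-domain Gaussian: tau = sqrt(2) * sigma."""
    return tau / np.sqrt(2.0)

def spatial_sigma(tau):
    """Standard deviation of the corresponding spatial kernel (Fourier pair of a Gaussian)."""
    return 1.0 / (2.0 * np.pi * tau_to_sigma(tau))

for tau in (0.2, 0.4, 0.8):
    print(f"tau={tau}: sigma_nu={tau_to_sigma(tau):.3f}, spatial sigma={spatial_sigma(tau):.3f}")
```

Doubling tau doubles the width of the frequency response and halves the width of the spatial kernel, which is the duality the paragraph above refers to.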

The CTF and the DTF are modelled as Gaussian functions primarily because of the popularity of this
model amongst designers of optical instruments [4], [5]. Several variations of the pure Gaussian
(e.g. sharpened Gaussian and the sum of two Gaussians [4]) have been used as models for the transfer
functions, especially for interpolative reconstruction systems. In our end-to-end model, we have three
system parameters: the sampling rate and the standard deviations of the Gaussian camera and display
transfer functions. These parameters can be varied to influence both image blur and sampling and
reconstruction blur. End-to-end simulation using these models for the imaging and reconstruction
subsystems thus allows us to study the interplay between the various degradations discussed in the
previous section and to correlate mathematical results (blur coefficients) with subjective (visual)
judgements about image quality. The primary goal of this simulation study is to identify a relationship
between the two transfer function parameters which will result in the best possible reconstructed image.
Such a relationship can then be used as a design rule for end-to-end systems employing scanning and
interpolative reconstruction.

Simulation Results


The numerical simulation of the imaging system described in Fig 1 has been performed on two images:
one of a cat’s face and the other of the central portion of a dollar bill. The images are 512 x 512 pixels
in dimension and are quantized to 16 bits/gray-level. For the purpose of display, these images have
been rescaled into a gray-level range of 0 to 255 (8 bits/gray-level).

In the simulation of the imaging process, the Fourier spectra of these scenes have been multiplied
with a Gaussian CTF to produce the corresponding image spectra. The inverse Fourier transform of
the image spectra produces the corresponding image in the spatial domain. It is with this image that
the final reconstructed scene is compared to judge the quality of reconstruction.

The sampling subsystem has been simulated by sub-sampling the 512 x 512 image down to 128 x 
128. A uniform sampling scheme has been chosen primarily due to its simplicity as well as its popularity 
amongst designers of (digital) optical equipment. 

Finally, the reconstruction process is implemented in a manner similar to the imaging process. The 
sampled images have been enlarged to 512 x 512 by zero-filling and their Fourier spectra have then 
been multiplied with a Gaussian DTF to produce the spectra of the reconstructed scenes. The inverse 
Fourier transform is then applied to these spectra to produce the reconstructed scenes in the spatial 
domain. 
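The three simulation stages described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes square images, takes "zero-filling" to mean that zeros are interleaved between the retained samples, measures the spread parameters in integer FFT frequency bins, and applies a gain of `factor**2` after reconstruction to restore the mean gray level (an assumption standing in for the contrast matching mentioned in the text). The function names are hypothetical.

```python
import numpy as np

def gaussian_tf(n, tau):
    """Gaussian transfer function on the n x n FFT grid; tau is in integer frequency bins."""
    f = np.fft.fftfreq(n) * n
    fx, fy = np.meshgrid(f, f)
    return np.exp(-(fx ** 2 + fy ** 2) / tau ** 2)

def simulate(scene, tau_ctf, tau_dtf, factor=4):
    """Imaging, sampling and reconstruction of a square scene (factor=4 maps 512 to 128)."""
    n = scene.shape[0]
    # Imaging: multiply the scene spectrum by the Gaussian CTF and transform back.
    image = np.fft.ifft2(np.fft.fft2(scene) * gaussian_tf(n, tau_ctf)).real
    # Sampling: keep every `factor`-th pixel in each direction.
    samples = image[::factor, ::factor]
    # Enlargement: place the samples back on the full grid with zeros in between.
    zero_filled = np.zeros_like(image)
    zero_filled[::factor, ::factor] = samples
    # Reconstruction: multiply by the Gaussian DTF; the gain is an assumption of this sketch.
    recon = np.fft.ifft2(np.fft.fft2(zero_filled) * gaussian_tf(n, tau_dtf)).real * factor ** 2
    return image, samples, recon

scene = np.random.default_rng(0).random((64, 64))      # hypothetical stand-in scene
image, samples, recon = simulate(scene, tau_ctf=9.6, tau_dtf=9.6)
print(image.shape, samples.shape, recon.shape)
print("SR error:", np.sum((image - recon) ** 2))
```

The reconstruction is compared with `image` rather than `scene`, following the convention that the reconstructed output is judged against the output of the imaging subsystem.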

The parameter of interest for the transfer functions is $\tau$ (Eq. 14), which controls the standard
deviation of the Gaussian functions used to model the CTF and the DTF. In order to study the
degradation caused by these two subsystems, $\tau_{CTF}$ and $\tau_{DTF}$ have been varied over a range of
values; the range selected is standardized with respect to the size of the sampled image (128 x 128).

The reconstructed scenes have been evaluated by about 20 people and the degradation values
corresponding to their choice of the best possible reconstruction are shown on the corresponding plots.
The observers were first shown the images after they were passed through the imaging subsystem and 
were then asked to find the most faithful reconstruction from the collection of processed images for 
different system parameters. The candidate images were displayed in a random order to eliminate any 
positional bias that may have been present. The contrasts of these images were also strictly matched 
to eliminate any contrast bias. 

Fig 5 shows the cat image after being passed through the acquisition phase of an imaging system.
Figs 6 - 8 show reconstructed images of this acquired image with different display subsystems. Figs 9
- 14 are plots of the different error components for the dollar and the cat images. These values have
been calculated using Eqs. (6)-(11), and the horizontal axis of the plots refers to a fraction of 128 which
represents the value of $\tau_{CTF}$. Figs 15 - 17 show several processed versions of the dollar bill image.
Fig 15 shows the original dollar bill which serves as the scene in our simulations. Fig 16 and Fig 17
show results from two opposite ends of the processing spectrum: Fig 16 shows the excessive blurring
introduced by the narrow (frequency domain) transfer functions, while Fig 17 exhibits the characteristic
sample-frequency ripple associated with a wide (frequency domain) display transfer function.

The preliminary results of the visual testing have yielded interesting “observations” about the visual
impact of the different kinds of degradations that are inevitably introduced in a nonideal end-to-end
imaging system. For the cat image, 18 out of the 20 observers chose the reconstructed image
corresponding to $\tau_{CTF} = 76.8$ (i.e. CTF FACTOR = 0.6 in Fig 7) as the “best” reconstructed scene.
From the plot it is clear that these values of the parameters correspond to a situation where the total
degradation is dominated by the sampling (or aliasing) blur. This reinforces the belief that the human
eye is more critical of blurring (of any kind) than of other types of degradation which introduce some
high frequency features that are not present in the original image. None of the observers selected those
images for which the image blur is the dominant degradation term. In particular, the sample-frequency
ripple effect (which manifests itself as a fine wire mesh over the images) helped to a certain extent to
create an illusion of feature (edge) sharpness that made the observers select images with a moderate
amount of this effect as the “best” images.

The subjective evaluation suggests that to the untrained human eye, image blur (i.e. any suppression
of high frequencies) is often more annoying than sampling artifacts which may create an illusion of
sharpness. However, much work still needs to be done. In planned extensions, we will control viewing
conditions more stringently to eradicate some biases that may be reflected in our current results. We
plan to use a digital monitor instead of film since we have experienced a great deal of contrast and
texture variability with film. There is also the need for more exhaustive testing (more scenes with a
greater variation in the frequency content etc.) under more controlled conditions. The end-to-end model
must be improved to incorporate more sophisticated models of the acquisition and display subsystems
as well as psychophysical parameters such as the contrast sensitivity function. The sophistication of
the human subjects with respect to digital image processing fundamentals may also be a significant
bias factor when testing certain images. Finally, the whole experiment would be incomplete unless an
end-to-end (initial scene to the reconstructed scene) simulation is performed and the analytical results
are correlated with visual testing.

Fig 5: Original CAT image with CTF FACTOR=0.6 and no SR degradation

Figs 6 - 8: Reconstructed scenes with CTF FACTOR=0.6 and DTF FACTOR=0.3, DTF FACTOR=0.6 and DTF FACTOR=0 (one of these is captioned "chosen as the best reconstruction of Fig 5 by selected observers")

Figure 9: Image and SR Blur vs CTF FACTOR (Dollar)

Figure 10: Image and SR Blur vs CTF FACTOR (Dollar)

Figure 11: Image and SR Blur vs CTF FACTOR (Dollar)